Rethinking Text Segmentation Models: An Information Extraction Case

Author

  • Christopher D. Manning

Similar resources

HMM Based High Accuracy Off-line Cursive Handwriting Recognition by a Baseline Detection Error Tolerant Feature Extraction Approach

HANDWRITING RECOGNITION BY A BASELINE DETECTION ERROR TOLERANT FEATURE EXTRACTION APPROACH. W. WANG, A. BRAKENSIEK, A. KOSMALA, G. RIGOLL. Dept. of Computer Science, Faculty of Electrical Engineering, Gerhard-Mercator-University Duisburg, Bismarckstr. 90, 47057 Duisburg, Germany. E-mail: {wwwang, anja, kosmala, rigoll}@fb9-ti.uni-duisburg.de. Hidden Markov Models (HMMs) can model the similarity and va...

Using Information Extraction to Aid the Discovery of Prediction Rules from Text

ABSTRACT Text mining and Information Extraction (IE) are both topics of significant recent interest. Text mining concerns applying data mining, a.k.a. knowledge discovery from databases (KDD) techniques to unstructured text. Information extraction (IE) is a form of shallow text understanding that locates specific pieces of data in natural language documents, transforming unstructured text into a stru...

Machine Learning for Information Extraction from XML marked-up text on the Semantic Web

The last few years have seen an explosion in the amount of text becoming available on the World Wide Web as online communities of users in diverse domains emerge to share documents and other digital resources. In this paper we explore the issue of how to provide a low-level information extraction tool based on hidden Markov models that can identify and classify terminology based on previously mark...

Controlling Bidirectional Parsing

Traditional models of parsing as used in interfaces have shown to be weak and ineffective in complex tasks such as processing of naturally-occurring texts. Broad coverage parsers go mad when confronted with extended inputs without sufficient information to control the interpretation process. The use of effective control strategies is necessary to overcome these shortcomings. Extra-linguistic criter...

A Modified Character Segmentation Algorithm for Farsi Printed Text Using Upper Contour Labelling

In this paper, a modified segmentation algorithm for printed Farsi words is presented. This algorithm is based on a previous work by Azmi that uses the conditional labeling of the upper contour to find the segmentation points. The main objective is to improve the segmentation results for low quality prints. To achieve this, various modifications on local baseline detection, contour labeling an...

Publication date: 2007